BALZR3.IJC[ESS,JMC] - www.SailDart.org

perm filename BALZR3.IJC[ESS,JMC] blob sn#026602 filedate 1973-02-15 generic text, type T, neo UTF8

pdq

A GLOBAL VIEW OF AUTOMATIC PROGRAMMING

ROBERT M. BALZER

USC INFORMATION SCIENCES INSTITUTE

ISI/RR-73-9

4676 ADMIRALTY WAY

MARINA DEL REY, CALIFORNIA 90291

(213) 822-1511

FEBRUARY 1973

This research is supported by the Advanced Research Projects

Agency under Contract No. DAHC15 72 C 0308, ARPA Order No.

2223/1, Program Code No. 3D30 and 3P10.

Views and conclusions contained in this study are the author's

and should not be interpreted as representing the official

opinion or policy of the University of Southern California or any

other person or agency connected with it.

A Global View of Automatic Programming Page 2
Robert M. Balzer

PREFACE AND ACKNOWLEDGEMENT

This paper represents the author's personal view of a global

description of Automatic Programming. This view resulted from

the author's discussions with and suggestions from numerous

colleagues on this area for which he is deeply indebted.

This paper is a condensation of this view, taken from a

larger work [1] which attempts to structure the field on the

conceptualization expressed here.

A Global View of Automatic Programming Page 3
Robert M. Balzer

INTRODUCTION

The goals of automatic programming are deceptively simple;

namely, the effective utilization of computers. This implies

both simplicity of use and efficient application of the computing

resources.

Thus, automatic programming is the application of a

computing system to the problem of effectively utilizing that or

another computing system in the performance of a task specified

by the user. Although this is certainly what we mean by

automatic programming, this definition does little to restrict

the set of applicable computer systems included in the automatic

programming domain. All compilers, operating systems, debugging

systems, text editors, etc., fall into this domain and, in fact,

the term 'automatic programming' itself was applied to early

compiler systems during the 1950s.

What we need therefore is to either appropriately restrict

the definition of automatic programming, or to provide a set of

criteria used for measuring the acceptability or performance of

an automatic programming system. Such a measure of system merit

is extremely hard to arrive at but would contain the following

man, machine, and system applicability components:

The efficiency of the resulting program;

The amount of computer resources expended in

arriving at that program;

A Global View of Automatic Programming Page 4
Robert M. Balzer

The elapsed time used in arriving at the resulting

program;

The amount of effort expended by the user in

specifying the task;

The reliability and ruggedness of the resulting

program;

The ease with wich future modifications can be

incorporated;

and finally,

The range and complexity of tasks which can be

handled by the system.

THE BASIC MODEL

We conjecture that the solution of every

computable problem can be represented entirely in

problem domain terms as a sequence, which may involve

loops and conditionals, of actions in that domain which

affect a data base of relationships between the

entities of the domain. Included either as part of the

data base or as a separate part of the model, is the

history of the model (i.e., the sequence of actions

applied to the model). This logically completes the

model and enables questions or actions involving

historical information to be handled. In a strong

sense, such a solution is a direct simulation of the

A Global View of Automatic Programming Page 5
Robert M. Balzer

domain. The system models at each step what would

occur in the domain.

The important part of the above conjecture is that

any computable problem can be solved, and hence

described, in problem domain terms. This enables us to

divide the solution into two parts, an external and an

internal part. The external part is the problem

specification given by the user in completely domain

specific terms. The requirements for such users is no

longer a comprehensive knowledge of computers, but

rather the ability to completely characterize the

relevant relationships between entities of the problem

domain and the actions in that domain. In addition,

such users should have a rough awareness of the problem

solving capability of the system so that they can

provide additional help where needed in the form of

more appropriate macro-actions, recommendations about

the use of certain actions, and/or imperative sequences

which will solve part or all of the problem in problem

related terms.

The internal part is first concerned with finding

a solution in problem related terms, if this has not

already been provided by the user. Second, this part

is concerned with finding efficient solutions given the

available computing resources. Such optimizations

A Global View of Automatic Programming Page 6
Robert M. Balzer

occur at two levels beyond what we normally consider

optimization. First, at the problem level, recognition

that certain entities and/or relationships are

irrelevant enables their removal from the model.

Second, since only part of the state of the modelled

domain is required, and only at certain points in the

solution process, rather than simulating the model

completely at each step, the system can employ

alternative representations which require less

maintenance and which either directly mirror the

required part of the domain state or allow such parts

to be computationally inferred. Such representations

may also enable more direct solution of the problem.

It is these optimizations which form the main

distinction between the code generation part of an

Automatic Programming system and current

state-of-the-art compilers.

Thus, our definition of an Automatic Programming

system is one which accepts a problem in terms of a

complete model of the domain, which obtains a solution

for the problem in terms of this model, and produces an

efficient computer implementation of this solution in

the form of a program.

.SYSTEM REQUIREMENTS

There are eight facilities that we desire in

A Global View of Automatic Programming Page 7
Robert M. Balzer

Automatic Programming Systems. The first is an

interactive system. We want interaction between the

system and the user so that the specification can be

given incrementally and any errors or discrepancies

that arise in or from such specification can be handled

as they occur.

Along with this interactiveness we also expect the

system to be very forgiving. It should allow great

flexibility in the way and time at which information is

specified. It should also be forgiving by allowing the

user to change or retract things that he has specified.

The second criteria is the amount of

non-proceduralness used in the specification of the

task to be performed. As far as possible we should

tell the system what it is we want to do rather than

how to do that process. There is a continuum between

the statement of a problem in terms of its initial

state and its goal state and a specification of how to

do it in a machine language. Most of computer languge

development can be viewed as a movement from specifying

HOW to do something towards a statement of WHAT is

desired. The level of non-proceduralness achievable

within an Automatic Programming Sytem is directly

related to the system's capability of turning goals

into actions and this is dependent upon its knowledge

A Global View of Automatic Programming Page 8
Robert M. Balzer

of how to acieve certain results in the problem domain.

We use the ability of the system to achieve results in

the problem domain as the main distinction between

non-procedural and procedural languages. Thus we

require that the problems be stated in a language

appropriate for that domain, i.e., one that can express

the structure and interrelationships of the entities

within that domain, and one that users are familiar

with for discussing and describing tasks and problems

within that domain. Only with such a language can the

system know how to achieve the desired results rather

than being directly told how to produce the desired

result. On the other hand, some actions can best be

described in terms of how to accomplish them rather

than by the resulting state. We have no bias against

or for such descriptions as long as they are specified

entirely in problem domain terms rather than

implementation computer terms.

Along with a non-procedural language for

specifying problems we also want to reduce the amount

that must be specified for the system to correctly

process the problem. This can best be done by removing

details from the specification and allowing the system

to fill them in. As with non-proceduralness there is a

continuum here but the level that we are aiming for is

one which omits from the specification all references

A Global View of Automatic Programming Page 9
Robert M. Balzer

to any entities not contained within the problem

domain. Specifically, we wish to avoid references to

the data structures of computer science; lists, rings,

arrays, symbol tables, and the like, that have been

invented as a means of specifying 'how' rather than

'what' to do.

Next we need a mechanism to enable us to change

specifications that have been previously entered either

because they don't work or because we want them to do

different things, or because the environment in which

they are operating has changed. Such specifications

should be given in terms of changes in desired results.

Again, entirely within the language of the problem

domain itself.

Once a problem has been specified we need a

mechanism for insuring that the system produced is the

one that we desired. This is especially critical since

the system will be performing much augmentation for us.

Discrepancies between the desired system and the one

produced can arise from an inadequate knowledge of the

system about the domain, from a misstatement by the

user, or from a misinterpretation of something that was

specified. Whatever the cause, it is important that

the user can see the produced system in operation in

his own terms, i.e., in the language of the problem

A Global View of Automatic Programming Page 10
Robert M. Balzer

domain itself, so that he can check the expected

behavior against the behavior produced. In addition,

we expect that the system should be able to generate

test input data so that a wide range of behavior of the

task specified can be observed by the user. If a

discrepancy is found the user additionally requires

capabilities to locate and isolate the source of the

discrepancy and then to modify it to obtain the desired

result.

After a correct program has been produced we need

a mechanism for the system to make this program

efficient. Such efficiency will rest on two kinds of

information. First, knowledge about the problem domain

which enables alternative ways of performing the same

task to be evaluated. Secondly, information about

efficient ways to utilize computers so that a total

cost can be assigned to each of the different ways of

solving the problem in domain terms. What we have

implied but not stated in the above is that the goal

Automatic Programming systems are not just providing

facilities to automate programming, but facilities

which help the user in moving from an understanding of

a task to be accomplished to a finished running system

which performs that task, i.e., an Automatic

Programming system is a system which aids the user in

all the steps from problem definition and design to

A Global View of Automatic Programming Page 11
Robert M. Balzer

final completed running programs.

The final criteria for the Goal Automatic

Programming system is the range of their applicability.

To meet all the above criteria the system requires

detailed knowledge about the problem domain. The

requirement for this knowledge limits the system

applicability to other areas, and hence one measure of

such systems is the range of problem domains which they

can adequately handle and their method of obtaining

this knowledge.

A Global View of Automatic Programming Page 12
Robert M. Balzer

ALTERNATIVE DEFINITIONS AND APPROACHES

In defining Automatic Programming we started with

a goal we wished to accomplish, namely, reducing the

effort required to get a task running on a computer.

From this we adopted a framework in which the external

characteristics of Automatic Programming Systems could

be described in terms of:

1. The terms in which the problem is stated;

2. The method and time at which the system acquires

the

knowledge of the problem domain;

3. The characteristic of the resulting program.

The choices we made are:

1. Problem statement in natural language in terms of

the

problem domain.

2. Knowledge about the domain acquired interactively

natural language in terms of the complete model of

the

problem domain.

3. Resulting programs which are optimized with

respect

to data representations, control structure, and

code.

A Global View of Automatic Programming Page 13
Robert M. Balzer

Other choices could certainly be made, and the

resulting systems might look very different. One

particular set of choices, suggested by Al Perlis,

represents a system based on a completely different

paradigm. His system is predicated on incremental

growth from an accepted base, namely, FORTRAN. This

idea is to increase the declarative parts of the

language at the expense of the procedural. The

declarative parts are in turn replaced by a series of

questions from the system which specify how a defined

concept is to be used. For instance, the concept

'array' might generate queries to see whether the size

was dynamic at run time, whether insertions or

deletions are being done, whether the elements are

homogeneous, and whether they are accessed

sequentially. From such questions and the programs

which use these concepts, an optimal representation can

be chosen.

In this system, higher levels occur when enough

example systems have been generated and understood by

humans that someone can codify this knowledge and

introduce a new level of semantics and questions.

There is no doubt that such a system would be

useful, and it has the appealing attribute that the

facilities of the system can be incrementally expanded

A Global View of Automatic Programming Page 14
Robert M. Balzer

from a widely distributed and available base. Also,

such a system could instantly be used by existing

programmers who could gradually learn to use the new

facilities on top of their ability to use FORTRAN.

On the other hand, it is not clear how far such a

technique can be pushed, and whether this is the best

way to achieve our goal except in the short-term. This

and other choices and approaches should, however, be

investigated.

A Global View of Automatic Programming Page 15
Robert M. Balzer

THE FOUR PHASES: AN OVERVIEW

Automatic Programming begins with the application

of problem solving to problem statements rather than

problem solutions; i.e., with the attempt by a computer

system to obtain an understanding of the task being

specified. Once the task has been understood, if it is

not in process form, it must be transformed into one.

This is the traditional area of Artificial Intelligence

and human program design. It must be verified that the

resulting process model is the one desired by the user

and that it is adequate for the user's problem. If

not, it must be modified and transformed by the above

steps and reverified. It must then be made into an

efficiently running program. This involves the

automation of the ad-hoc knowledge of computer science.

A complete Automatic Programming system thus

consists of four major phases: Problem Acquisition,

Process Transformation, Model Verification, and

Automatic Coding. Problem Acquisition is the process

by which the system obtains (1) a description of the

problem to be solved or task to be performed in a form

processable by the system, and (2) the knowledge needed

to solve the problem. The result of this phase is a

well formed problem and knowledge base which can be

manipulated by the system and transformed into a high

A Global View of Automatic Programming Page 16
Robert M. Balzer

level process for solving the problem during the second

phase. The third phase is used to verify that this

process is the one desired and that it is adequate for

the problem solution. The fourth phase, Automatic

Coding, fills in the necessary details, optimizes the

process, and produces the actual code to solve the

problem.

A Global View of Automatic Programming Page 17
Robert M. Balzer

THE AUTOMATIC PROGRAMMING MODEL

One of the most striking and deep rooted features

of the Automatic Programming model presented here is

the interface it creates between a high level external

specification of a problem which omits data structures

which are not part of the domain and the internal

implementation of that specification is an efficient

representation.

This choice of a basic interface has predicated

large parts of the entire model. The reason for this

choice as the basic interface within the Automatic

Programming model is that through it we expect to gain

in four important ways.

First, the complete model conjecture states that

such a division is feasible for stating and solving

domain dependent problems.

Second, since the choice of a data representation

and the maintenance of its consistency occupy such a

large portion of current programs we expect that the

size and complexity of specifications without such

representations will be drastically reduced over those

which retain such representations.

Third, since so much detail has been removed from

the specification it is easier for the system to

A Global View of Automatic Programming Page 18
Robert M. Balzer

understand what the task is rather than getting lost in

the details of what is going on.

Finally, since the problem has not been

overspecified with a particular choice of

representation so that the problem was expressible, the

system is now free to choose a representation that will

efficiently solve the problem at hand. The system has

been given increased flexibility in its choice and may

well outperform humans in correctly making

representation choices. Not because the system is more

intelligent than the user, but because it can cycle

through more possibilities and bring to bear a greater

level of effort in such optimizations than any user is

willing or able to invest in such issues.

A Global View of Automatic Programming Page 19
PROBLEM ACQUISITION

PROBLEM ACQUISITION

The Problem Acquisition phase is concerned with

obtaining an understanding of the users problem and the

domain in which it exists so that the Process

Transformation phase can attempt to find a sequence of

transformations or operations in that domain which will

obtain the solution required by the user. Thus, the

Problem Acquisition phase is concerned with building

the complete model of the users domain which represents

the interactions between the entities of that domain

and the effect on those entities by the allowed

transformations or operations applicable within the

domain. It is our primary contention that only through

the development of such a complete model of the user's

domain can the Automatic Programming system have any

degree of generality in the domains for which it is

applicable.

Currently, all such models of user domains have

been coded into a system. We are proposing that such

models can be specified to the system by its users and

that through these models the system can acquire the

knowledge necessary to solve problems within these

domains and to understand what is required for such a

solution. The two main issues, then, are what

constitutes an adequate and appropriate model and how

A Global View of Automatic Programming Page 20
PROBLEM ACQUISITION

is such a model specified or communicated to the

system.

There are basically two types of models,analogical

and fregean. Analogical models bear a strong

resemblence to the structure of the object being

described, such as the floor plan for a room, or the

diagram model used by Gerlernter in his geometry

proving program.

Fregean systems, on the other hand, are linguistic

or relation based, in which expressions are built up on

the relation between functions and arguments to those

functions.

Since one of the basic goals of the Automatic

Programming system is the generality of problem domains

that it is willing and capable of handling we have

chosen to take the fregean model approach.

We can now define the adequacy and appropriateness

of the models for the Automatic Programming system. A

fregean model is adequate if it contains a complete

enough description of the relations between the

entities in the problem domain that a sequence of

operations or transformations on this model can be

built to solve the problem posed by the user. This is

what we referred to as the complete model in the

A Global View of Automatic Programming Page 21
PROBLEM ACQUISITION

Desiderata Chapter. Thus, the adequacy of a model is

dependent upon the use of that model to solve the

problem. Operationally, this requires that the

Automatic Programming system is capable of finding the

complete set of applicable transformations on the model

and can calculate the consequences of each of these

actions. The appropriateness of the model is a measure

of how well suited the available transformations are to

solving the problem at hand, i.e., an adequate model

can be made more appropriate by adding to it

non-primitive transformations made up of a sequence of

primitive ones, which are suitable building block for

the problem being posed. The model may also be made

more appropriate by including recommendations about the

suitability of alternative strategies for sequences of

model transformations.

We expect that the well-known problem of building

a powerful general purpose problem solver will be

significantly aided by the user's tailoring of the

specified model to make it more appropriate for the

problem at hand.

The state of the art in natural language

understanding is adequate for the description of

problems and of models and is beyond our ability to

utilize the information thus obtained, and hence,

A Global View of Automatic Programming Page 22
PROBLEM ACQUISITION

should not be a bottleneck in an Automatic Programming

system. Evidence for this viewpoint comes from the

work of Woods in the Moonrocks Program, from the work

of Winograd in the Blocks Description Program, and the

work of Martin Kay in the Mind System. Each of these

systems represents an alternative linguistic technology

and each is capable of handling a wide range of

linguistic forms within the domain of its competence.

The basic viewpoint then is to process the users

natural language communication with the understanding

that it is meant to convey to the Automatic Programming

system a model of his problem domain. Towards this end

the system can extract entities and the relationships

between them from the communication. It can further

query the user as to the relationships between entities

which have not, as yet, been explicitly specified but

which have been inferred by the previous communication.

Such inferences by the system about the incompleteness

of the model require a sophisticated understanding not

only of the communication but of the types of models

used for problem domain specification. Unfortunately,

our sophistication in both these areas is quite

limited. In communication we need to be able to

understand how information is ordered for presentation,

how context is established and utilized, how the

capabilities of the recipient effects the communication

A Global View of Automatic Programming Page 23
PROBLEM ACQUISITION

and how these capabilities are perceived by the

speaker. In modelling we need to have a space of

possible models, an understanding of how the parts of a

model interact, a means for recognizing incompleteness

in models, as well as inconsistencies, a means for

obtaining all the allowed operations on the model, and

the means for transforming the models with these

operations.

PROCESS TRANSFORMATION

Our contention is that the main activity in

programming is not finding a solution but in finding a

solution which drops out the irrelevancies and which

abstracts the necessary processing so that it can be

efficiently implemented. We recognize that this is a

strong contention but we feel that in most programming

problems a solution is known and the main concern is in

finding a more efficient one. We are not talking here

about optimization in the normal sense. We are

concerned here with finding irrelevancies in the

complete model and representational abstractions based

on the required processing of that model. Once these

logical representations have been found they must be

efficiently implemented.

The above contention, if true, greatly shifts the

emphasis within the Process Transformation Phase from

A Global View of Automatic Programming Page 24
PROBLEM ACQUISITION

that of a general problem solver solving problems in a

domain independent way to modifying a solution so that

it does not maintain any irrelevant portions of the

complete model and which abstracts the relevant

portions into a more efficient representation for the

processing required. Together with Problem Acquisition

the ability to find representational abstractions and

transform complete model solutions into ones which

utilize these representations represent the main

technological deficiencies with obtaining an Automatic

Programming system.

MODEL VERIFICATION

Although we can be assured that the Automatic

Coding phase will produce only correct code, Program

Testing cannot disappear. This is because the Problem

Acquisition Phase and the Process Transformation Phase

will undoubtedly employ a number of heuristics and may

very well incorrectly interpret either the problem

statement or the allowed transformations that can occur

in the user's model. Because of this, the user must

verify that the system created is the one that he

desired.

The technology for this is at hand. It consists

of today's methods wherein a test case is given to the

system and its performance is used to validate the

A Global View of Automatic Programming Page 25
PROBLEM ACQUISITION

model that it constructed. Additionally, the system

can aid the process by generating test cases of its own

which probe areas of which it is uncertain and which

could have led to either misunderstanding or

incompleteness in the original model. One might also

expect that program debugging would disappear, but for

very similar reasons it too will remain under Automatic

Programming. If there is a disparity between the users

model and the system's model, then the reason for this

disparity must be obtained.

AUTOMATIC CODING

Automatic Coding is concerned with finding an

efficient computer implementation of the process

description obtained from the proceeding phase. This

description does not yet include a choice of data

representations, but does specify the major processing

elements and sequences. It is intended that this phase

will not need any domain specific knowledge except for

input frequency and distribution information. The

major logical representation and processing decisions

have already been made by the Process Transformation

phase.

Of all the phases in the Automatic Programming

System, the Automatic Coding one is the one essential

component of any Automatic Programming System. Without

A Global View of Automatic Programming Page 26
PROBLEM ACQUISITION

it the system cannot produce programs, and hence,

though it may be useful it is not an Automatic

Programming System.

Most people are not truly creative when they

reorganize sections of their program to increase

efficiency. Rather than inventing totally new

representations they appear to select one out of an

ill-defined set of such possible representations and to

adapt and modify it to function in the current

situation. This is probably the main challenge to the

Automatic Coding phase, the ability not only to cycle

through a set of alternative representations but to

adapt and modify them to the existing situation. Such

an ability would vastly increase the applicableness of

a small set of alternative representations.

From such Automatic Coding studies, one would

expect to see both a set of heuristics and a calculus,

eventually, for data representation choices.

A Global View of Automatic Programming Page 27
PROBLEM ACQUISITION

REFERENCES

1. AUTOMATIC PROGRAMMING, R. M. BALZER Institute

Technical Memorandum ISI/RR-1 USC INFORMATION SCIENCES

INSTITUTE September 1972